AMENDMENT AND RESPONSE UNDER 37 CFR § 1.111
Serial Number: 10/643,587
Filing Date: August 18, 2003
Title: SCHEDULING SYNCHRONIZATION OF PROGRAMS RUNNING AS STREAMS ON MULTIPLE PROCESSORS

# IN THE SPECIFICATION

# Please amend the paragraph beginning on page 1 at line 9, as follows:

This application is related to U.S. Patent Application No. [[ ]] <u>10/643,769</u>, entitled "SCHEDULING SYNCHRONIZATION OF PROGRAMS RUNNING AS STREAMS ON MULTIPLE PROCESSORS", filed on even date herewith; U.S. Patent Application No. [[ ]] <u>10/643,744</u>, entitled "Multistream Processing System and Method", filed on even date herewith; to U.S. Patent Application No. [[ ]] <u>10/643,577</u>, entitled "System and Method for <del>Synchronizing</del> <u>Processing</u> Memory <u>Instructions</u> <del>Transfers</del>", Serial No. [[ ,]] filed on even date herewith; to U.S. Patent Application No. [[ ]] <u>10/643,742</u>, entitled "... Data in a Multiprocessor System", filed on even date herewith; to U.S. Patent Application No. [[ ]] <u>10/643,586</u>, entitled "Decoupled <del>Vector</del> Scalar/Vector Computer Architecture System and Method (as amended)", filed on even date herewith; to U.S. Patent Application No. [[ ]] <u>10/643,585</u>, entitled "Latency Tolerant Distributed Shared Memory Multiprocessor Computer", filed on even date herewith; to U.S. Patent Application No. [[ ]] <u>10/643,754</u>, entitled "Relaxed Memory Consistency Model", filed on even date herewith; to U.S. Patent Application No. [[ ]] <u>10/643,758</u>, entitled "Remote Translation Mechanism for a Multinode System", filed on even date herewith; and to U.S. Patent Application No. [[ ]] <u>10/643,741</u>, entitled "<del>Method and Apparatus for Local Synchronizations in a Vector Processor System</del> <u>Multistream Processing Memory-And Barrier-Synchronization Method and Apparatus</u>", filed on even date herewith, each of which is incorporated herein by reference.

# Please amend the three paragraphs beginning on page 3, line 5, as follows:

One aspect of the system and method is that a process is started in an operating system. Additionally, a plurality of program units associated with the process are started. When a context shifting event occurs, each of the plurality of program units has their scheduling synchronized and their context set so that each thread processes the context shifting event.

A further aspect of the system is that some program units may be executing on more than one multiple processor unit. <del>In the</del> <u>The</u> operating system selects a multiple processor unit to host all of the program units, and migrates those program units that are not currently on the selected multiple processor unit to the selected multiple processor unit.

The present <del>invention</del> <u>application</u> describes systems, clients, servers, methods, and computer-readable media of varying scope. In addition to the aspects and advantages of the present embodiments of the invention described in this summary, further aspects and advantages of the embodiments of the invention will become apparent by reference to the drawings and by reading the detailed description that follows.

# Please amend the paragraph beginning on page 4, line 2, as follows:

In the following detailed description of exemplary embodiments of the invention, reference is made to the accompanying drawings which form a part hereof, and in which is shown by way of illustration specific exemplary embodiments in which the <del>invention</del> <u>inventive subject matter</u> may be practiced. These embodiments are described in sufficient detail to enable those skilled in the art to practice the embodiments of the invention, and it is to be understood that other embodiments may be utilized and that logical, mechanical, electrical and other changes may be made without departing from the scope of the present <del>invention</del> <u>inventive subject matter</u>.

# Please amend the paragraph beginning on page 4, line 28, as follows:

In the Figures, the same reference number is used throughout to refer to an identical component which appears in multiple Figures. Signals and connections may be referred to by the same reference number or label, and the actual meaning will be clear from its use in the context of the description. Further, the same base reference number (e.g. [[120]] <u>102</u>) is used in the specification and figures when generically referring to the actions or characteristics of a group of identical components. A numeric index introduced by a decimal point (e.g. [[120.1]] <u>102.1</u>) is used when a specific component among the group of identical components performs an action or has a characteristic.

Dkt: 1376.717US1

# Please amend the paragraph beginning on page 5, line 11, as follows:

FIG. 1 is a block diagram of parallel processing hardware and operating environment 100 in which different embodiments of the invention can be practiced. In some embodiments, environment 100 comprises a node 101 which includes two or more multiple processor units 102. Although two multiple processor units 102.1 and 102.2 are shown in FIG. 1, it will be appreciated by those of skill in the art that other <del>number</del> <u>numbers</u> of multiple processor units may be incorporated in environment 100 and in configurations other than in a node 101. In some embodiments of the invention, node 101 may include up to four multiple processor units 102. Each of the multiple processor units 102 on node 101 has access to node memory 108. In some embodiments, node 101 is a single printed circuit board and node memory 108 comprises daughter cards insertable on the circuit board.

# Please amend the two paragraphs beginning on page 6, line 1, as follows:

In one embodiment, the hardware environment is included within the <del>Cray-X1</del> <u>CRAY X1</u> computer system, which represents the convergence of the <del>Cray T3E</del> <u>CRAY T3E</u> and the traditional Cray parallel vector processors. The X1 is a highly scalable, cache coherent, shared-memory multiprocessor that uses powerful vector processors as its building blocks, and implements a modernized vector instruction set. In these embodiments, multiple processor unit 102 is a Multi-streaming processor (MSP). It is to be noted that FIG. 1 illustrates only one example of a hardware environment, and other environments (for other embodiments) may also be used.

# Please amend the paragraph beginning on page 6, line 8, as follows:

FIG. 2 is a block diagram of a parallel processing software environment 200 according to embodiments of the invention. In some embodiments, software environment 200 comprises an operating system that manages the execution of applications 202. Applications may also be referred to as processes. In some embodiments of the invention, the operating system is a UNIX based operating system, such as the <del>Unicos/mp</del> UNICOS/mp operating system from Cray Inc. However, the embodiments of the invention [[is]] are not limited to a particular operating system.

# Please amend the paragraph beginning on page 7, line 26, as follows:

In some embodiments of the inventions, the context shifting event is a "signal." A signal in <del>Unicos/mp</del> <u>UNICOS/mp</u> and other UNIX variations is typically an indication that some type of exceptional event has occurred. Examples of such events include floating point exceptions when an invalid floating point operation is attempted, a memory access exception when a process or thread attempts to access memory that does not exist or is not mapped to the process. Other types of signals are possible and known to those of skill in the art. Additionally, it should be noted that in some operating environments, a signal may be referred to as an exception.
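By way of illustration only, and not as part of the amended specification, the following C sketch shows signal delivery in a single-threaded user program using the standard `signal()` and `raise()` calls. The names `record_signal`, `last_signal`, and `deliver_signal_demo` are hypothetical, and `SIGINT` is chosen only because it is defined by standard C; the sketch does not model the multi-stream scheduling synchronization described in the application.

```c
#include <assert.h>
#include <signal.h>

/* Hypothetical illustration; not code from the application. */
static volatile sig_atomic_t last_signal = 0;

/* Handler: runs in the signal context and records which signal arrived. */
static void record_signal(int signo)
{
    last_signal = signo;
}

/* Installs the handler, raises SIGINT against the calling program, and
 * returns the signal number the handler observed (or -1 on failure). */
int deliver_signal_demo(void)
{
    void (*previous)(int) = signal(SIGINT, record_signal);
    if (previous == SIG_ERR) {
        return -1;
    }

    last_signal = 0;
    int raised = raise(SIGINT);   /* handler completes before raise() returns */

    signal(SIGINT, previous);     /* restore the earlier disposition */

    assert(raised != 0 || last_signal == SIGINT);
    return raised == 0 ? (int)last_signal : -1;
}
```

In this sketch the shift into the handler and back is the context shifting event; in the embodiments described, every thread of the application would be synchronized before the event is processed.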

# Please amend the paragraph beginning on page 8, line 5, as follows:

In alternative embodiments, the context shifting event may be a non-local goto. For example, in <del>Unicos/mp</del> <u>UNICOS/mp</u> and other UNIX variants, a combination of "setjmp()" and "longjmp()" function calls can establish a non-local goto. In essence, the "setjmp" call establishes the location to go to, and the "longjmp" call causes process or thread to branch to the location. The goto is a non-local goto because it causes the execution of the thread or process to continue at a point outside of the scope of the currently executing function. A context shift is required, because the processor registers must be set to reflect the new process or thread execution location.
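By way of illustration only, and not as part of the amended specification, the following C sketch shows a non-local goto built from the standard `setjmp()` and `longjmp()` calls. The names `resume_point`, `carried_code`, `deep_work`, and `nonlocal_goto_demo` are hypothetical, and the sketch covers a single thread only.

```c
#include <assert.h>
#include <setjmp.h>

/* Hypothetical illustration; not code from the application. */
static jmp_buf resume_point;          /* location established by setjmp() */
static volatile int carried_code = 0; /* value handed across the jump */

/* Leaves its own scope and branches back to the setjmp() site. */
static void deep_work(int code)
{
    carried_code = code;
    longjmp(resume_point, 1);         /* does not return to the caller */
}

/* Establishes the jump target, calls deep_work(), and returns the code
 * that was carried back by the non-local goto. */
int nonlocal_goto_demo(int code)
{
    assert(code > 0);                 /* assumption: demo codes are positive */

    if (setjmp(resume_point) == 0) {
        deep_work(code);              /* direct path: jumps back above */
        return -1;                    /* not reached */
    }

    /* Reached only via longjmp(): execution resumed outside deep_work(). */
    return carried_code;
}
```

Calling `nonlocal_goto_demo(7)` returns 7: control leaves `deep_work()` without a normal return, and the saved register state in `resume_point` is restored, which is the context shift the paragraph refers to.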

# Please amend the paragraph beginning on page 8, line 13, as follows:

In further alternative embodiments, the context shifting event may be a system call. Typically a system call requires that the process or thread enter a privileged mode in order to execute the system call. In <del>Unicos/mp</del> <u>UNICOS/mp</u> and UNIX variants, the system call must typically execute in kernel mode, while normally a process or thread executes in user mode. In order to execute in kernel mode, a context shift is required.

# Please amend the paragraph beginning on page 8, line 18, as follows:

Those of skill in the art will appreciate that other context shifting events are possible and within the scope of the <del>invention</del> <u>inventive subject matter</u>.

# Please amend the paragraph beginning on page 8 at line 20, as follows:

Upon receiving indication of a context shifting event, the operating environment must synchronize the state of the threads associated with the application (block 340). In some embodiments of the invention, this synchronization may be performed by a "gsync" instruction. The operation of the gsync instruction is described in detail in United States Patent Application Serial No. <u>10/643,577</u> [[ ]] entitled "System and Method for <del>Synchronizing</del> <u>Processing</u> Memory <u>Instructions</u> <del>Transfers</del>" which has been previously incorporated by reference.

# Please amend the paragraph beginning on page 9 at line 11, as follows:

As each thread executes in the new context, local synchronization may be required, for example when a thread leaves a system call or when one thread requires results calculated by another thread. In some embodiments of the invention, an "msync" instruction may be used to synchronize the execution of the threads. The operation of the msync instruction is described in detail in United States Patent Application Serial No. <u>10/643,577</u> [[ ]] entitled "System and Method for <del>Synchronizing</del> <u>Processing</u> Memory <u>Instructions</u> <del>Transfers</del>" which has been previously incorporated by reference.